Automatic quantification of tumor-stroma ratio as a prognostic marker for pancreatic cancer

Purpose: This study introduces a multi-step pipeline for automatic tumor-stroma ratio (TSR) quantification as a potential prognostic marker for pancreatic cancer, addressing the limitations of existing staging systems and the lack of commonly used prognostic biomarkers.

Methods: The proposed approach uses a deep-learning-based method for the automatic segmentation of tumor epithelial cells, tumor bulk, and stroma from whole-slide images (WSIs). Models were trained using five-fold cross-validation and evaluated on an independent external test set. TSR was computed from the segmented components. Additionally, TSR's predictive value for six-month survival was assessed on the independent external dataset.

Results: The models achieved a median Dice (inter-quartile range, IQR) of 0.751 (0.15) and 0.726 (0.25) for tumor epithelium segmentation on the internal and external test sets, respectively, and a median Dice of 0.76 (0.11) and 0.863 (0.17) for tumor bulk segmentation on the internal and external test sets, respectively. TSR was evaluated as an independent prognostic marker, demonstrating a cross-validation AUC of 0.61±0.12 for predicting six-month survival on the external dataset.

Conclusion: Our pipeline for automatic TSR quantification offers promising potential as a prognostic marker for pancreatic cancer. The results underscore the feasibility of computational biomarker discovery in enhancing patient outcome prediction, thus contributing to personalized patient management.
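For illustration only, the two quantities reported above can be sketched on binary masks as follows. The Dice coefficient is the standard overlap measure. The TSR formula shown here (stroma area inside the tumor bulk divided by the tumor bulk area, with stroma taken as bulk minus tumor epithelium) is an assumption made for this sketch, since the exact definition is not restated in this letter. The function names `dice` and `tumor_stroma_ratio` are hypothetical and are not taken from the released code.

```python
import numpy as np


def dice(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def tumor_stroma_ratio(tumor_bulk: np.ndarray, epithelium: np.ndarray) -> float:
    """ASSUMED definition: stroma area inside the bulk / tumor bulk area.

    Stroma is approximated as the tumor bulk minus the tumor epithelium.
    """
    bulk = tumor_bulk.astype(bool)
    # Only count epithelium pixels that fall inside the tumor bulk.
    epi_in_bulk = np.logical_and(epithelium.astype(bool), bulk)
    bulk_area = bulk.sum()
    if bulk_area == 0:
        raise ValueError("tumor bulk mask is empty")
    return float((bulk_area - epi_in_bulk.sum()) / bulk_area)
```

Under this assumed definition, a TSR close to 1 corresponds to a stroma-rich tumor bulk. If the pipeline defines the ratio differently, only the last line of `tumor_stroma_ratio` would need to change.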

Reviewer's Comment #3: Can this protocol be expanded to accommodate co-staining of immune cell markers? I speculate that TSR together with immune scoring (e.g., TILs, tumor-infiltrating lymphocytes) would be more prognosis-relevant, especially considering the ambiguity that an equally high TSR may indicate either an immune-"hot" or an immune-"cold" tumor microenvironment.
Response: A combination of CK18/8 and, for example, a CD3 IHC stain for lymphocytes would certainly be possible. One possible approach would extend the first U-Net to segment the epithelium and the TILs, while the second U-Net would segment the tumor epithelium, the healthy epithelium, and the TILs. Survival analysis would then be performed using the combination of the two biomarkers. This is an interesting approach that we will certainly consider in future work, but given the significant extra work and cost (recutting tissue blocks, restaining, and developing new models), we did not include these experiments in this work. We did add the following to the discussion: "In future work, it would be interesting to explore additional biomarkers, such as the presence of tumour-infiltrating lymphocytes, in combination with TSR."

Reviewer's Comment #4: No codebase and no trained model released on GitHub? I understand the imaging data can be sensitive to share, but I do expect some codebase and trained model availability for a computational paper, with an illustration using the publicly available TCGA PDAC dataset. I hope it's reasonable for anyone who wants to reproduce or reuse this work.

Response:
We thank the reviewer for this comment. We have addressed this limitation by making our Docker container publicly available. It can be found at the following link: https://github.com/DIAGNijmegen/automatic-tsr-quantification-for-pdac

Reviewer's Comment #5: Maybe a statistical question: why not use a censored survival association model and report the C-index instead of logistic regression?
Response: We expanded the survival analysis by adding the Kaplan-Meier estimator. The results have been added to the paper in the Results section, subsection "Survival analysis". We added the following sentences to the paper: "Furthermore, we employed a Kaplan-Meier estimator, categorizing patients into high-risk and low-risk groups based on the TSR value, with a threshold set at 0.73, chosen as it represented the mean TSR value. The results are displayed in Fig 7. Although the Kaplan-Meier estimator indicates some differences in risks between patients, the significance level is not reached."

Reviewer's Comment #8: It's better to spell out the full name of TSR in the title of the manuscript.

Response:
We have addressed the reviewer's comment by changing the title of the manuscript to use "tumor-stroma ratio" instead of "TSR".
Reviewer's Comment #9: I don't quite understand the eligibility criteria of dataset B: why does it require both PDAC and a history of other cancers?
Response: Dataset B comes from an existing research study into pancreatic cancer by researcher V.K. That study used the availability of a history of previous cancer as an inclusion criterion, which we mention for completeness. In our study, we did not use this information in any way.
Editor's Comment a: If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

Response:
The data from datasets A, B, and C cannot be publicly shared without limitation due to the terms of the agreement in place and the presence of sensitive information in the data. However, researchers who meet the criteria for accessing confidential data may request access by submitting a detailed request outlining the scope of their research. This request will be forwarded to the ethics committee for approval. As outlined in the paper, the study received approval from the local ethics committee of Radboudumc (CMO-2016-3045, CMO-2017-3780).
Editor's Comment b: If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers.
Response: All data from dataset D are available from the National Cancer Institute database (accession number phs000178, link: https://portal.gdc.cancer.gov/projects/TCGA-PAAD). Dataset D was used to perform the survival analysis, and researchers who want to replicate the results can download the data from the public repository and apply the algorithm published on GitHub, as specified in our response to Reviewer's Comment #4.

I have attached a revised version of the manuscript that incorporates the suggested changes. Additionally, I have included a marked-up copy highlighting the modifications made in response to each reviewer's comments. I believe that these revisions strengthen the manuscript and address the concerns raised during the review process, and I am confident that they enhance the overall quality and contribution of the paper. I would like to express my gratitude for the constructive feedback provided by the reviewers. Their insights have been instrumental in refining the manuscript, and I am hopeful that the revised version meets the standards of PLOS ONE.

Figure 7. Kaplan-Meier estimator. Stratification was done using the mean TSR value.
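As a rough illustration of the stratification described in this caption, the sketch below splits a cohort at its mean TSR and computes a Kaplan-Meier (product-limit) curve. It is a minimal NumPy sketch, not the code used for the paper: the function names are hypothetical, which side of the threshold counts as high-risk is not specified here, and no significance test (e.g. a log-rank test) is included.

```python
import numpy as np


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per patient.
    events: 1 if death was observed, 0 if the patient was censored.
    Returns the distinct event times and S(t) just after each of them.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)            # still under observation at t
        deaths = np.sum((times == t) & events)  # observed deaths at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)


def split_by_mean_tsr(tsr):
    """Boolean mask of patients whose TSR is above the cohort mean."""
    tsr = np.asarray(tsr, dtype=float)
    return tsr > tsr.mean()
```

Calling `kaplan_meier` once on the patients selected by the mask and once on the remaining patients gives the two curves to compare. A library such as lifelines would additionally provide confidence intervals and a log-rank test.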
Thank you for your time and consideration. I look forward to hearing from you soon.

Sincerely,
Pierpaolo Vendittelli
PhD Candidate
Radboud University Medical Centre
pierpaolo.vendittelli@radboudumc.nl